Live Drum Separation Using Probabilistic Spectral Clustering Based on the Itakura-Saito Divergence
Authors
Abstract
We present a live drum separation system for a specific target drumset to be used as a front end in a complete live drum understanding system. Our system decomposes drum note onsets onto spectral drum templates by adapting techniques from non-negative matrix factorization. Multiple templates per drum are computed using a new Gamma mixture model clustering procedure to account for the variety of sounds that can be produced by a single drum. This clustering procedure imposes an Itakura-Saito distance metric on the cluster space. In addition, we utilize “tail” templates for each drum which greatly improve the separation accuracy when cymbals with long decay times are present.
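The decomposition step described above can be illustrated with a small sketch. This is not the authors' implementation: it assumes the spectral templates are already available as columns of a fixed matrix `W`, and it assumes the Itakura-Saito divergence is also used as the decomposition cost (the abstract only states that it is used in the clustering step). The function names `is_divergence` and `decompose_onset` are hypothetical, and the Gamma mixture model clustering and "tail" templates are not shown.

```python
import numpy as np


def is_divergence(x, y, eps=1e-12):
    """Itakura-Saito divergence D_IS(x || y), summed over frequency bins."""
    x = np.asarray(x, dtype=float) + eps
    y = np.asarray(y, dtype=float) + eps
    ratio = x / y
    return float(np.sum(ratio - np.log(ratio) - 1.0))


def decompose_onset(v, W, n_iter=200, eps=1e-12):
    """Estimate non-negative activations h with v ~= W @ h, keeping W fixed.

    Assumption: uses the standard multiplicative update for the
    Itakura-Saito divergence from the NMF literature; the abstract does
    not say which cost the system uses for this step.
    """
    v = np.asarray(v, dtype=float) + eps
    W = np.asarray(W, dtype=float) + eps
    h = np.ones(W.shape[1])
    for _ in range(n_iter):
        v_hat = W @ h
        # Ratio of the negative and positive parts of the gradient w.r.t. h.
        h *= (W.T @ (v / v_hat**2)) / (W.T @ (1.0 / v_hat))
    return h


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    W = rng.random((64, 3)) + 0.1          # 3 synthetic templates, 64 bins
    h_true = np.array([1.0, 0.0, 0.5])     # synthetic ground-truth activations
    v = W @ h_true                         # synthetic onset spectrum
    h = decompose_onset(v, W)
    print("activations:", h)
    print("divergence:", is_divergence(v, W @ h))
```

Two properties make this divergence a natural fit for audio spectra: it is scale invariant, so quiet and loud bins are weighted by their relative error, and it is asymmetric, so it is not a distance metric in the strict sense.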
Similar resources
Multi-Template Shift-Variant Non-Negative Matrix Deconvolution for Semi-Automatic Music Transcription
For the task of semi-automatic music transcription, we extended our framework for shift-variant non-negative matrix deconvolution (svNMD) to work with multiple templates per instrument and pitch. A k-means clustering based learning algorithm is proposed that infers the templates from the data based on the provided user information. We experimentally explored the maximum achievable transcription...
Itakura-Saito Divergence Non Negative Matrix Factorization with Application to Monaural Speech Separation
Monaural source separation is an area that has received much attention in the signal processing community, as it is a pre-processing step in many applications. Many solutions based on Non-Negative Matrix Factorization (NMF) have been developed to achieve clean separation. In this work, we propose a variant of Itakura-Saito divergence NMF based on a source-filter model that ca...
Topological Data Analysis with Bregman Divergences
Given a finite set in a metric space, the topological analysis generalizes hierarchical clustering using a 1-parameter family of homology groups to quantify connectivity in all dimensions. Going beyond Euclidean distance and really beyond metrics, we show that the tools of topological data analysis also apply when we measure distance with Bregman divergences. While these divergences violate two...
Blind Audio Source Separation Exploiting Periodicity and Spectral Envelopes
In this paper we focus on the use of windows in the frequency domain processing of data for the purpose of spectral parameter estimation. Classical frequency domain asymptotics replace linear convolution by circulant convolution leading to approximation errors. We show how the introduction of windows can lead to slightly more complex frequency domain techniques, replacing diagonal matrices by b...
Speech recognition based on Itakura-Saito divergence and dynamics/sparseness constraints from mixed sound of speech and music by non-negative matrix factorization
We considered a speech recognition method for mixed sound, which is composed of both speech and music, that only removes music based on non-negative matrix factorization (NMF). We used Itakura-Saito divergence instead of Kullback-Leibler divergence to compare the cost function, and the dynamics and sparseness constraints of a weight matrix to improve speech recognition. For isolated word recogn...